Improving predictions: Enhancing in-hospital mortality forecast for ICU patients with sepsis-induced coagulopathy using a stacking ensemble model

The incidence of sepsis-induced coagulopathy (SIC) is high, leading to increased mortality rates and prolonged hospitalization and intensive care unit (ICU) stays. Early identification of SIC patients at risk of in-hospital mortality can improve patient prognosis. The objective of this study was to develop and validate machine learning (ML) models to dynamically predict in-hospital mortality risk in SIC patients. ML models were established based on the Medical Information Mart for Intensive Care IV (MIMIC-IV) database to predict in-hospital mortality in SIC patients. Univariate feature selection was used for feature screening. The optimal model was determined by calculating the area under the curve (AUC) with a 95% confidence interval (CI) and was interpreted using Shapley Additive Explanation (SHAP) values. Among the 3112 SIC patients included from MIMIC-IV, a total of 757 (25%) patients died during hospitalization. Univariate feature selection identified the 20 most critical variables from the original feature set. Among the 10 developed ML models, the stacking ensemble model exhibited the highest AUC (0.795, 95% CI: 0.763–0.827). Anion gap and age emerged as the most significant features for predicting mortality risk in SIC. In this study, an ML model was constructed that performed well in predicting in-hospital mortality risk in SIC patients. Specifically, the stacking ensemble model demonstrated superior predictive ability.


Introduction
Sepsis refers to a clinical syndrome characterized by physiological, pathological, and biochemical abnormalities resulting from a dysregulated response of the organism to infection, leading to life-threatening organ dysfunction. [1] According to the Global Burden of Disease Study Analysis released in 2020, there were approximately 48.9 million sepsis cases globally, along with 11.0 million sepsis-related deaths. [2] Sepsis patients in the early stages can exhibit a complex prothrombotic state characterized by activation of the exogenous coagulation pathway, cytokine-induced coagulation amplification, inhibition of anticoagulant pathways, and fibrinolysis impairment. [3,4] The diagnostic criteria for sepsis-induced coagulopathy (SIC) aim to identify patients in the early stages who exhibit reversible alterations in coagulation status. The diagnostic criteria, formulated in 2017 by the members of the International Society on Thrombosis and Haemostasis Subcommittee on Disseminated Intravascular Coagulation (DIC) Scientific and Standardization Committee, comply with the updated definition of sepsis. [5] The SIC score comprises 3 components: platelet count, international normalized ratio (INR), and Sequential Organ Failure Assessment (SOFA) score. As a scoring system, a SIC score ≥ 4 indicates the diagnosis of sepsis-induced coagulopathy. [6] The study revealed that as the SIC score increases, the mortality risk demonstrates a linear escalation, with a mortality rate exceeding 45% at a score of 6. [5] In a study conducted in China, it was found that 67.9% of patients diagnosed with sepsis 3.0 also met the diagnostic criteria for SIC. [7] SIC, as a highly prevalent condition, currently lacks specific therapeutic interventions. It is characterized by a high mortality rate and poor prognosis, making it one of the common causes of mortality in the intensive care unit (ICU). [8]
The progression from SIC to DIC is a continuum: SIC is characterized pathologically and physiologically by a hypercoagulable state, which then transitions toward a hypocoagulable state during the DIC phase. [9] Hence, early identification of mortality risk in SIC contributes to timely intervention, preventing further immune dysregulation and thereby reducing mortality rates. Given the time-sensitive nature of SIC treatment, predicting mortality risk in the early stages of SIC is crucial for improving survival rates in patients with SIC.
Machine learning, as a vital component of artificial intelligence, is a strategy that enables data to speak for itself as much as possible. It can overcome the limitations of traditional clinical statistical methods in interpreting high-dimensional, non-linear, and longitudinal electronic medical record data. [10] In recent years, the clinical application scope of ML has expanded from diagnosis to prediction and has been utilized across various clinical domains. ML algorithms have also been employed for prognostication of critically ill patients. [11] Research indicates that machine learning algorithms can enable early and dynamic prediction of SIC based on medical data. [12] However, there is still a lack of effective tools in clinical practice for predicting the mortality risk of SIC. The aim of this study is to establish and validate an ML model for predicting in-hospital mortality in patients with SIC.

Data source
We extracted data on SIC patients from the Medical Information Mart for Intensive Care IV (MIMIC-IV) open clinical database for the purpose of developing machine learning models. The MIMIC-IV database systematically collected data from patients admitted to the ICUs of Beth Israel Deaconess Medical Center in Boston, Massachusetts, USA, from 2008 to 2019. This project has received approval from the Institutional Review Boards of the Massachusetts Institute of Technology and Beth Israel Deaconess Medical Center. To apply for access to this database, we completed an examination on the protection of human research participants and obtained a certificate (Certificate Number: 50618389). All health data of patients in this database have been de-identified, thereby obviating the need for obtaining informed consent from the patients. This study was conducted in accordance with the principles of the 2013 Helsinki Declaration.

Study population
The inclusion criteria were as follows: age ≥ 18 years; initial hospitalization and first admission to the ICU; ICU length of stay > 1 day; meeting the diagnostic criteria of Sepsis 3.0, wherein sepsis is defined as a suspected infection accompanied by a rapid increase in SOFA score ≥ 2 [1]; and SIC score ≥ 4 (Supplementary Table 1, http://links.lww.com/MD/L995), based on the worst daily values of SIC-related indicators during ICU hospitalization. The exclusion criteria were as follows: patients with ≥ 2 admissions to the ICU; data missing ≥ 20%.
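The screening rules above can be sketched in a few lines of Python. Supplementary Table 1 is not reproduced in this text, so the point thresholds below follow the commonly published ISTH SIC scoring (platelet count, INR, and a four-organ SOFA component capped at 2 points) and should be read as an assumption rather than the authors' exact table. The `Patient` record and all of its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout; field names are illustrative only.
    age: int
    icu_admissions: int         # number of ICU admissions on record
    icu_los_days: float         # ICU length of stay in days
    suspected_infection: bool
    sofa_increase: int          # acute rise in SOFA score
    platelets: float            # worst daily value, x10^9/L
    inr: float                  # worst daily value
    sofa4: int                  # assumed four-organ SOFA sum used by the SIC score
    missing_fraction: float     # share of variables missing for this patient


def sic_score(platelets: float, inr: float, sofa4: int) -> int:
    """SIC score under the assumed ISTH thresholds (range 0 to 6)."""
    platelet_pts = 2 if platelets < 100 else 1 if platelets < 150 else 0
    inr_pts = 2 if inr > 1.4 else 1 if inr > 1.2 else 0
    sofa_pts = min(max(sofa4, 0), 2)
    return platelet_pts + inr_pts + sofa_pts


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion and exclusion criteria listed in the text."""
    return (
        p.age >= 18
        and p.icu_admissions == 1
        and p.icu_los_days > 1
        and p.suspected_infection
        and p.sofa_increase >= 2
        and sic_score(p.platelets, p.inr, p.sofa4) >= 4
        and p.missing_fraction < 0.20
    )


example = Patient(age=82, icu_admissions=1, icu_los_days=3.5,
                  suspected_infection=True, sofa_increase=4,
                  platelets=53, inr=1.5, sofa4=3, missing_fraction=0.05)
eligible = is_eligible(example)
```

Keeping the score and the eligibility test as separate functions makes it easy to swap in the exact thresholds from the supplementary table if they differ from the assumed ones.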

Data collection and results
The variables collected in this study cover 7 aspects: demographic characteristics (age, gender); first care unit; vital signs on the first day of ICU admission (heart rate, respiratory rate, mean blood pressure [MBP], body temperature, arterial oxygen saturation [SpO2]); scoring scales on the first day of ICU admission (SOFA score, SIC score); laboratory test results (hematocrit, hemoglobin, platelets, white blood cell count, anion gap, bicarbonate, blood urea nitrogen [BUN], calcium, glucose, chloride, creatinine, sodium, potassium, absolute [Abs] basophils, Abs eosinophils, Abs lymphocytes, Abs monocytes, Abs neutrophils, INR, prothrombin time [PT], partial thromboplastin time); complications (myocardial infarction, congestive heart failure, chronic pulmonary disease, diabetes, hypertension); and length of hospital ICU stay. We utilized laboratory test results and blood biomarker levels measured within the first day of ICU hospitalization. In cases where multiple measurements were taken within the first day, we used the minimum value for the respective indicators.
The outcome for this study was in-hospital mortality among SIC patients.
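The first-day aggregation rule (keep the minimum when an indicator is measured several times) can be illustrated as follows. This is a minimal sketch: the tuple layout, the 24-hour window expressed in hours since ICU admission, and the sample values are all assumptions made for illustration, not the MIMIC-IV schema.

```python
def first_day_minimum(measurements, window_hours=24.0):
    """Return {(stay_id, variable): minimum value in the first ICU day}.

    `measurements` is an iterable of tuples
    (stay_id, variable, hours_since_icu_admission, value).
    """
    lowest = {}
    for stay_id, variable, hours, value in measurements:
        # Skip missing values and anything outside the first-day window.
        if value is None or not (0.0 <= hours < window_hours):
            continue
        key = (stay_id, variable)
        if key not in lowest or value < lowest[key]:
            lowest[key] = value
    return lowest


# Hypothetical measurements for two ICU stays.
rows = [
    (1, "platelets", 2.0, 88.0),
    (1, "platelets", 14.0, 61.0),
    (1, "platelets", 30.0, 40.0),   # after the first day, ignored
    (1, "potassium", 5.0, None),    # missing, ignored
    (2, "potassium", 1.0, 4.1),
    (2, "potassium", 9.0, 3.6),
]
first_day = first_day_minimum(rows)
```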

Statistical analysis
The distribution of continuous variables was assessed using the Kolmogorov-Smirnov test. Normally distributed continuous variables were compared using t-tests and expressed as mean and standard deviation. Non-normally distributed continuous variables were compared using the Mann-Whitney U test and expressed as median with interquartile range. Categorical variables were presented as numbers (percentages) and compared using the χ² test or Fisher exact test. All statistical tests were 2-tailed. SPSS software was used for data computation and statistical analysis. The ML models and receiver operating characteristic (ROC) curves were generated using R software (version 4.3.0). P < .05 indicated statistical significance.

Feature engineering technique based on univariate feature selection
We used a univariate feature selection technique to choose the optimal subset of predictive features. Univariate feature selection aims to reduce feature dimensionality and enhance model performance. This technique independently assesses the statistical relationship between each feature and the target variable, selecting the features that provide the most information for predicting the target variable. [13] We utilized a support vector machine with a Gaussian kernel for modeling. Robustness and generalization of the model were assessed through repeated 5-fold cross-validation, and detailed training results were saved. Furthermore, the model parameters were tuned through 3 repetitions of cross-validation, utilizing ROC as the evaluation metric. We also enabled class probability computation so that the model could output predicted probabilities.
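The idea of scoring each feature on its own can be illustrated with a deliberately simplified sketch. It does not reproduce the SVM-based procedure with repeated cross-validation described above; instead it swaps in a plain single-feature AUC as the univariate criterion, which is an assumption made to keep the example dependency-free. The feature names and values are hypothetical.

```python
def single_feature_auc(values, labels):
    """AUC of one feature used directly as a score (ties count one half)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_top_k(features, labels, k):
    """Rank features independently and keep the k most discriminative.

    `features` maps a feature name to its list of values. A feature that
    separates the classes in either direction is useful, so the strength
    is max(AUC, 1 - AUC).
    """
    strength = {}
    for name, values in features.items():
        auc = single_feature_auc(values, labels)
        strength[name] = max(auc, 1.0 - auc)
    ranked = sorted(strength, key=strength.get, reverse=True)
    return ranked[:k], strength


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
died = [1, 1, 0, 0]
toy_features = {
    "anion_gap": [20, 18, 12, 11],   # higher in deaths
    "spo2": [90, 92, 97, 99],        # lower in deaths
    "noise": [1, 2, 1, 2],           # unrelated to outcome
}
selected, strength = select_top_k(toy_features, died, k=2)
```

In the study itself, 20 of 37 candidate features were retained; the value of `k` here is arbitrary.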

Model development and validation
We selected Extreme Gradient Boosting (XGBoost), [14] Random Forest (RF), [15] k-Nearest Neighbors, [16] support vector machine, [17] Light Gradient Boosting Machine, [18] Decision Tree, Logistic Regression, [19] Elastic Net, [20] Single Hidden Layer Neural Network (SHLNN), [21] and a Stacking Ensemble Model (Elastic Net + SHLNN + XGBoost) [22] to construct the mortality risk prediction models. For this purpose, we performed a 10-fold cross-validation process on the input data. The performance of the ML models was assessed by calculating the area under the curve (AUC) with a 95% CI, as well as accuracy, precision, recall, and F1 Score. By comparing the AUC of each model, we determined the optimal model. We also conducted decision curve analysis and calibration curve analysis. The use of the Shapley Additive Explanation (SHAP) [23] algorithm provided interpretability for the optimal model, quantifying the contribution of each feature to the predictions made by the best model, while also analyzing 2 case studies.
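A stacking ensemble of this shape can be sketched with scikit-learn. This is not the authors' R implementation: the data are synthetic, `GradientBoostingClassifier` stands in for XGBoost, an elastic-net-penalized logistic regression stands in for Elastic Net, a one-hidden-layer `MLPClassifier` stands in for the SHLNN, and the logistic-regression meta-learner and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort: 20 features, roughly 25% positive class.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# Three base learners mirroring Elastic Net + SHLNN + XGBoost (stand-ins).
base_models = [
    ("elastic_net", make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, max_iter=5000))),
    ("shlnn", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                      random_state=0))),
    ("boosting", GradientBoostingClassifier(random_state=0)),
]

# The meta-learner is trained on out-of-fold base-model probabilities
# (10 folds), which keeps it from seeing predictions on training rows
# that the base models were fitted on.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(),
                           cv=10, stack_method="predict_proba")
stack.fit(X_train, y_train)

risk = stack.predict_proba(X_test)[:, 1]   # predicted mortality risk
test_auc = roc_auc_score(y_test, risk)
```

The out-of-fold construction is the main design point of stacking: the meta-learner learns how to weight the base models from predictions they made on data they had not been trained on.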

Baseline characteristics and feature selection
Among the 14,804 sepsis patients identified in the MIMIC-IV database, 3112 were diagnosed with SIC. A total of 757 patients died during hospitalization, while 2355 patients survived. The case screening process is illustrated in Figure 1. The characteristics of SIC patients are presented in Table 1.
We utilized the univariate feature selection method to curate a subset of 20 features from the initial pool of 37 for the purpose of model development (Supplementary Figure 1, http://links.lww.com/MD/L997). The selected features were SOFA score, SIC score, age, hemoglobin, platelets, anion gap, bicarbonate, chloride, potassium, creatinine, BUN, INR, partial thromboplastin time, heart rate, MBP, respiratory rate, temperature, SpO2, myocardial infarction, and first care unit.

Comparison of 10 models
We divided the study population into a training set consisting of 2177 individuals (70%) and a test set consisting of 935 individuals (30%). The stacking ensemble model achieved the highest AUC in the test set (0.795, 95% CI: 0.763-0.827), surpassing individual models such as Elastic Net (AUC: 0.768), SHLNN (AUC: 0.768), and XGBoost (AUC: 0.789) (Fig. 2A). To gain a deeper insight into the performance of these 10 models, we also measured their accuracy, precision, recall, and F1 Score, with the results listed in Table 2. The decision curve analysis curves and calibration curve demonstrate that the stacking ensemble model exhibits the most satisfactory predictability (Fig. 2B, Supplementary Figure 2, http://links.lww.com/MD/L998).
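The evaluation quantities named in this section can be written out explicitly. The sketch below is illustrative only: the text does not state how the 95% CI was computed, so the Hanley-McNeil standard error used here is an assumption and is not expected to reproduce the reported interval exactly. The net benefit formula is the standard one used in decision curve analysis, and all labels, risks, and group sizes in the example are hypothetical.

```python
import math


def auc_confidence_interval(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil standard error."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_pos - 1) * (q1 - auc * auc)
                + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    half_width = z * math.sqrt(variance)
    return max(0.0, auc - half_width), min(1.0, auc + half_width)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


def net_benefit(y_true, risk, threshold):
    """Decision-curve net benefit of treating patients with risk >= threshold."""
    n = len(y_true)
    tp = sum(t == 1 and r >= threshold for t, r in zip(y_true, risk))
    fp = sum(t == 0 and r >= threshold for t, r in zip(y_true, risk))
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical group sizes and predictions, for illustration only.
ci_low, ci_high = auc_confidence_interval(0.80, n_pos=200, n_neg=700)
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
benefit = net_benefit([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.1], threshold=0.3)
```

Plotting `net_benefit` over a range of thresholds for each model, alongside the treat-all and treat-none strategies, gives the decision curves of the kind shown in Fig. 2B.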

Interpretability of the model
To gain a deeper understanding of how the ensemble model predicts mortality, we employed the SHAP algorithm to explain the model outcomes. The feature importance of the stacking ensemble model is depicted in Supplementary Figure 3 (http://links.lww.com/MD/L999). Among the 20 explanatory variables, anion gap was identified as the most important variable, followed by age, SpO2, and heart rate. We used SHAP summary plots to illustrate the overall positive and negative impacts of continuous and categorical variables on the output of the stacking ensemble model. Among the categorical variables, first care unit contributes the most to the model value (Fig. 3). Among the continuous variables, anion gap contributes the most to the model value (Fig. 4). We randomly selected 2 samples and employed the SHAP analysis method to interpret the predictive results of the stacking ensemble model. The red and blue bars represent positive and negative effects, respectively; longer bars indicate greater feature importance. For instance number 979, heart rate = 107 and anion gap = 20 played a predominant positive role in predicting the outcome, while SpO2 = 93 and age = 51 had a significant negative impact. The model output value was 0.34, surpassing the baseline of 0.25, and the model correctly predicted the patient as an in-hospital mortality case (Fig. 5A). For instance number 1174, age = 82 and platelets = 53 had a noteworthy positive impact on predicting the outcome. The final model output value was 0.14, which was lower than the baseline of 0.25, and the model correctly predicted the patient as a survivor (Fig. 5B). Figure 6 summarizes the univariate distributions of the top 4 continuous variables in the stacking ensemble model output.
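The relationship between the baseline, the per-feature contributions, and the model output in a force plot follows from the additivity property of Shapley values: the baseline plus the sum of all contributions equals the prediction. The sketch below computes exact Shapley values for a toy risk function by enumerating feature orderings. The function, its coefficients, and the patient values are hypothetical and unrelated to the fitted stacking model; a single reference patient stands in for the background distribution that SHAP would normally average over.

```python
from itertools import permutations


def toy_risk(f):
    """Hypothetical risk function; coefficients are not fitted to any data."""
    risk = 0.25
    risk += 0.010 * (f["anion_gap"] - 14)
    risk += 0.002 * (f["age"] - 65)
    risk -= 0.010 * (f["spo2"] - 96)
    # Interaction term, so contributions depend on the order of inclusion.
    risk += 0.0002 * (f["heart_rate"] - 90) * (f["anion_gap"] - 14)
    return risk


def shapley_values(predict, x, background):
    """Exact Shapley values; absent features take their background value."""
    names = list(x)
    orderings = list(permutations(names))
    phi = dict.fromkeys(names, 0.0)
    for order in orderings:
        current = dict(background)
        previous = predict(current)
        for name in order:
            current[name] = x[name]          # reveal this feature's true value
            updated = predict(current)
            phi[name] += (updated - previous) / len(orderings)
            previous = updated
    return phi


reference = {"anion_gap": 14, "age": 65, "spo2": 96, "heart_rate": 90}
patient = {"anion_gap": 22, "age": 80, "spo2": 91, "heart_rate": 115}

baseline = toy_risk(reference)
phi = shapley_values(toy_risk, patient, reference)
reconstructed = baseline + sum(phi.values())   # equals toy_risk(patient)
```

Positive entries in `phi` correspond to the red bars of a force plot and negative entries to the blue bars; the interaction term is split equally between the two features involved.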

Web development
We deployed the stacking ensemble model on a Shiny web page to make it easily accessible to clinical practitioners. The website can be accessed at the following link: https://saexgboost.shinyapps.io/SIC1/. Using this website, users can assess the mortality risk of SIC patients and view the predictive results.

Discussion
Machine learning has found widespread applications in the medical field, revolutionizing healthcare by enhancing the efficiency and accuracy of diagnosis, treatment, and patient care. [24] In this article, we explore the effectiveness of machine learning in predicting the mortality risk of SIC patients and provide explanations for the best-performing model.
Our study demonstrates that the stacking ensemble model can predict in-hospital mortality risk in SIC more accurately than other algorithms. The ensemble model algorithm employs a stacking approach, [25] combining 3 base models, namely Elastic Net, SHLNN, and XGBoost.
In this study, anion gap emerged as the most crucial variable in the clinical mortality prediction model for SIC, followed by vital signs (temperature and SpO2). Parameters such as BUN, SpO2, and potassium also contribute to the assessment of SIC risk. Several studies have also indicated a close association between the above-mentioned indicators and the risk of mortality in SIC patients. Numerous studies have validated the association between serum anion gap and the mortality rate of critically ill patients, [26][27][28] with the anion gap serving as a risk factor for long-term extracorporeal support. [29] This evidence supports the notion that the anion gap can serve as a predictor of mortality among SIC patients. A retrospective single-center cohort study by Erez Marcusohn et al indicated an association between a body temperature > 39.5°C and adverse clinical course. [30] Serum creatinine and BUN are important indicators for evaluating renal function. Measuring these 2 parameters in the early stages of SIC can aid in assessing renal function, identifying sepsis-related acute kidney injury, and predicting disease progression and prognosis. [31] As an essential physiological parameter in the human body, SpO2 serves to assess the circulatory system functionality and stands as a crucial monitoring indicator in early resuscitation protocols. [32] Lara Hessels et al's study revealed a close association between potassium disturbances and in-hospital mortality, which persisted even after adjusting for disease severity and acute kidney injury (AKI). [33] However, there is limited research available regarding the relationship between serum potassium levels and mortality in SIC patients.
To the best of our knowledge, this study represents the first attempt at applying a stacking ensemble model to predict the mortality risk of patients with SIC. In comparison to existing solutions, our research introduces a stacking ensemble model approach that forecasts in-hospital mortality risk among SIC patients in the ICU, thereby offering a valuable supplement to current clinical decision-making methods. In our comparison, the stacking ensemble model achieved a higher AUC than each of the individual machine learning methods, demonstrating its efficacy in discerning crucial clinical features among SIC patients. However, it is imperative to acknowledge certain limitations in this study. Despite the partitioning of the dataset into training and test sets, the challenge of insufficient multicenter data for a more comprehensive model validation persists. Compared to recent advancements in sepsis management, the management of sepsis-induced coagulopathy still has a long way to go.

Conclusion
This study demonstrates the effectiveness of the stacking ensemble model in predicting in-hospital mortality risk among SIC patients. The anion gap was the most important predictor of in-hospital mortality risk in the model.

Figure 2. (A) The discriminative ability of the 10 models was compared using ROC curves and AUC. (B) The performance of 10 machine learning models was evaluated using DCA across different decision thresholds. AUC = area under the curve, DCA = decision curve analysis, KNN = k-Nearest neighbors, Light GBM = light gradient boosting machine, LR = logistic regression, RF = random forest, SHLNN = single hidden layer neural network, SVM = support vector machine, XGBoost = extreme gradient boosting.

Figure 3. Feature importance of categorical variables. The yellow dots represent deceased samples, while the purple dots represent surviving samples. The horizontal axis represents the names of categorical variables, where 1 indicates presence, and 0 indicates absence. For "First care unit," 1 to 9 correspond to CCU, CVICU, MICU, MICU/SICU, Neuro intermediate, Neuro SICU, Neuro stepdown, SICU, and TSICU, respectively.

Figure 4. Hive plot. Continuous variables were ranked based on the sum of SHAP values across all patients, with SHAP values used to illustrate the distribution of the impact of each continuous variable on the output of the stacking ensemble model. SpO2 = arterial oxygen saturation, BUN = blood urea nitrogen, INR = international normalized ratio, MBP = mean blood pressure, PTT = partial thromboplastin time, SIC = sepsis-induced coagulopathy, SOFA = sequential organ failure assessment.

Figure 5. The force plot illustrates interpretable examples of single-sample feature prediction results. Red indicates a positive impact on the model outcome, while blue denotes a negative impact. The length represents the importance of features. The baseline value (0.25) represents the average of the prediction model; the output value represents the predicted risk of death. The figure provides an interpretation of the death instance (A) and the survival instance (B). SpO2 = arterial oxygen saturation, BUN = blood urea nitrogen, INR = international normalized ratio, MBP = mean blood pressure, PTT = partial thromboplastin time, SIC = sepsis-induced coagulopathy, SOFA = sequential organ failure assessment.

Figure 6. Univariate SHAP plots. Yellow represents deceased samples, while purple represents surviving samples. (A) Displays the positive impact of high anion gap values on the model prediction results. (B) Illustrates the positive impact of advanced age on prediction outcomes. (C) Indicates the negative influence of high SpO2 values on the prediction results. (D) Demonstrates the positive impact of increased heart rate on the model prediction results.
The authors have no funding and conflicts of interest to disclose.
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. Written informed consent for participation was not required for this study. Supplemental Digital Content is available for this article.
a Youjiang Medical University for Nationalities, Baise, China, b Baise People's Hospital, Baise, China, c Beijing Neurosurgical Institute, Beijing Tiantan Hospital, Capital Medical University, Beijing, China.

Table 1
Baseline characteristics between the survival group and death groups in the MIMIC-IV cohort.

Table 2
Performance comparison of models on the training dataset.